Plant promoter sequences and methods of use for same

ABSTRACT

The invention discloses novel promoter sequences capable of expressing genes in plant cells. The promoters include engineered versions of the maize ubiquitin promoter to increase expression levels beyond those observed with the native ubiquitin promoter and alter the tissue preference. Expression constructs, vectors, transgenic plants and methods are also disclosed.

This application is a continuation of U.S. application Ser. No. 09/590,558, filed Jun. 9, 2000, now abandoned.

FIELD OF THE INVENTION

This invention relates generally to the field of plant molecular biology and in particular to engineered promoter sequences and their combined arrangement within a promoter region such that expression of an expression construct is enhanced in a plant cell.

BACKGROUND OF THE INVENTION

Gene expression encompasses a number of steps originating from the DNA template, ultimately to the final protein or protein product. Control and regulation of gene expression can occur through numerous mechanisms. The initiation of transcription of a gene is generally thought of as the predominant control of gene expression. Transcriptional controls (or promoters) are generally short sequences embedded in the 5′-flanking or upstream region of a transcribed gene. There are promoter sequences which affect gene expression in response to environmental stimuli, nutrient availability, or adverse conditions including heat shock, anaerobiosis or the presence of heavy metals. There are also DNA sequences which control gene expression during development, or in a tissue, or in an organ specific fashion, and, of course, there are constitutive promoters.

Promoters contain the signals for RNA polymerase to begin transcription so that protein synthesis can proceed. DNA-binding, nuclear-localized proteins interact specifically with these cognate promoter DNA sequences to promote the formation of the transcriptional complex and eventually initiate the gene expression process. The entire region containing all the ancillary elements affecting regulation or absolute levels of transcription may be comprised of less than 100 base pairs or as much as 1 kilobase pair.

One of the most common sequence motifs present in the promoters of genes is the “TATA” element which resides upstream of the start of transcription. Promoters are also typically comprised of components which include a TATA box consensus sequence at about 35 base pairs 5′ relative to the transcription start site or cap site which is defined as +1. The TATA motif is the site where the TATA-binding-protein (TBP) as part of a complex of several polypeptides (TFIID complex) binds and productively interacts (directly or indirectly) with factors bound to other sequence elements of the promoter. This TFIID complex in turn recruits the RNA polymerase II complex to be positioned for the start of transcription generally 25 to 30 base pairs downstream of the TATA element and promotes elongation, thus producing RNA molecules.

In most instances sequence elements other than the TATA motif are required for accurate transcription. Such elements are often located upstream of the TATA motif and a subset may have homology to the consensus sequence CCAAT.

Promoters are usually positioned 5′ or upstream relative to the start of the coding region of the corresponding gene, and the entire region containing all the ancillary elements affecting regulation or absolute levels of transcription may be comprised of less than 100 base pairs or as much as 1 kilobase pair.

A number of promoters which are active in plant cells have been described in the literature. These include nopaline synthase (NOS) and octopine synthase (OCS) promoters (which are carried on tumor inducing plasmids of Agrobacterium tumefaciens), the cauliflower mosaic virus (CaMV) 19S and 35S promoters, the light-inducible promoter from the small subunit of ribulose bisphosphate carboxylase (ssRUBISCO, a very abundant plant polypeptide), the alcohol dehydrogenase (AdhI and AdhII) promoters from maize, and the sucrose synthase promoter. All of these promoters have been used to create various types of DNA constructs which have been expressed in plants. (See for example PCT publication WO84/02913 Rogers, et al). Perhaps the most commonly used promoter is the 35S promoter of Cauliflower Mosaic Virus. The CaMV 35S promoter is a dicot virus promoter; however, it directs expression of genes introduced into protoplasts of both dicots and monocots. The 35S promoter is a very strong promoter and this accounts for its widespread use for high level expression of traits in transgenic plants. The CaMV 35S promoter, however, has also demonstrated relatively low activity in several agriculturally significant graminaceous plants such as wheat.

The promoters of the maize genes encoding alcohol dehydrogenase, AdhI and AdhII, have also been widely used in plant cell transformations. Both genes are induced after the onset of anaerobiosis. Maize AdhI has been cloned and sequenced, as has AdhII. Formation of an AdhI chimeric gene, Adh-CAT, comprising the AdhI promoter linked to the chloramphenicol acetyltransferase (CAT) coding sequences and nopaline synthase (NOS) 3′ signal, caused CAT expression at approximately 4-fold higher levels at low oxygen concentrations than under control conditions. Sequence elements necessary for anaerobic induction of the ADH-CAT chimeric gene have also been identified. An anaerobic regulatory element (ARE) exists between positions −140 and −99 of the maize AdhI promoter and is composed of at least two sequence elements, at positions −133 to −124 and positions −113 to −99, both of which have been found to be necessary and sufficient for low oxygen expression of ADH-CAT gene activity. The Adh promoter, however, responds to anaerobiosis and is not a constitutive promoter, drastically limiting its effectiveness.

Yet another important promoter in plants is the maize ubiquitin promoter which is described in U.S. Pat. No. 5,510,474, to Quail et al., the disclosure of which is incorporated herein by reference (SEQ ID NO:15). This promoter has become widely used in transgenic plant protocols. The promoter, as described in the patent, comprises RNA polymerase recognition and binding sites, a transcriptional initiation sequence (cap site), regulatory sequences responsible for inducible transcription, an untranslatable intervening sequence (intron) between the transcriptional start site and the translational initiation site, and two overlapping heat shock consensus promoter sequences 5′ (−214 and −204) of the transcriptional start site. The entire promoter is almost 2 kb in length and has been shown to be functional in both monocot and dicot plants. The sequence of the maize ubiquitin promoter is disclosed in Quail et al. Expression levels achieved with the ubiquitin (Ubi-1) promoter driving the CAT gene in oat protoplast cells were higher than those of the CaMV promoter (Quail et al.).

There is a continuing need in the art for high level expression promoters, as well as promoters which are spatially defined in their expression patterns.

Expression of foreign nucleotide sequences introduced to cells must achieve more than a basal expression rate to produce enough protein to effect the desired phenotype or to harvest from the cell.

It is a primary object of this invention to provide novel maize Ubi-1 promoter sequences that increase expression of introduced genes in plant cells and plant tissues, compared to the non-engineered promoter.

It is yet another object of the invention to provide promoter sequences which result in expression in transgenic plants which unexpectedly alters or reverses the ratio of endosperm/embryo expression from known Ubi-1 promoters in the seed of regenerated plants.

It is an object of this invention to provide recombinant promoter molecules that provide for reliably high levels of expression of introduced genes in target cells.

It is yet another object of this invention to provide plants, plant cells and plant tissues containing the recombinant promoter of the invention.

It is yet another object of the invention to provide vehicles for transformation of plant cells including viral or plasmid vectors and expression cassettes incorporating the novel promoter sequences of the invention.

It is yet another object of the invention to provide bacterial cells comprising such vectors for maintenance, replication, and plant transformation.

Other objects of the invention will become apparent from the description of the invention which follows.

SUMMARY OF THE INVENTION

The present invention comprises the design of novel regulatory nucleotide sequences which provide for improved expression of a nucleotide sequence, such as a structural gene, in plants, both monocotyledonous and dicotyledonous. According to the invention, several engineered versions of a maize ubiquitin promoter are described which provide for expression levels that are higher than that achieved with native ubiquitin promoters and which spatially provide for altered expression levels in the embryo and endosperm of seed of regenerated plants.

The invention further comprises expression cassettes comprising the promoters of the invention, a structural gene, the expression of which is desired in plant cells, and a polyadenylation or stop signal. The expression cassette can be encompassed in a plasmid or viral vector for transformation of plant cells.

The invention also encompasses transformed bacterial cells for maintenance and replication of the vector, as well as transformed monocot or dicot cells and ultimately transgenic plants, and breeding materials developed from the transgenic plants.

According to the invention, ubiquitin promoters are provided which differ from prior ubiquitin promoters primarily in the area of the heat shock region, which comprises overlapping heat shock elements; this region is engineered to remove one of the elements, to remove the overlap of the sequences, or to delete both elements entirely. In a preferred embodiment binding domains for transcription factors may be inserted in this area. The interaction between the overlapping heat shock elements and the intron region with the rest of the 5′ sequence in the ubiquitin promoter is unknown and was previously thought to be critical for full promoter function. See Quail, supra. Applicants have found not only that the promoter still functions adequately, despite prior teachings to the contrary, but also, quite surprisingly, that engineering in this region increases expression over the previous ubiquitin promoter system and alters the expression ratio of the protein from embryo to endosperm. The Ubi-1 promoter, previously thought to be constitutive, has recently been shown to express preferentially in the seed (WO 98/139461, published Sep. 11, 1998), making the engineered promoters of the invention with endosperm expression surprising.

For purposes of this application the following terms shall have the definitions recited herein. Units, prefixes, and symbols may be denoted in their SI accepted form. Unless otherwise indicated, nucleic acids are written left to right in 5′ to 3′ orientation; amino acid sequences are written left to right in amino to carboxy orientation, respectively. Numeric ranges are inclusive of the numbers defining the range and include each integer within the defined range. Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes. Unless otherwise provided for, software, electrical, and electronics terms as used herein are as defined in The New IEEE Standard Dictionary of Electrical and Electronics Terms (5th edition, 1993). The terms defined below are more fully defined by reference to the specification as a whole.

By “amplified” is meant the construction of multiple copies of a nucleic acid sequence or multiple copies complementary to the nucleic acid sequence using at least one of the nucleic acid sequences as a template. Amplification systems include the polymerase chain reaction (PCR) system, ligase chain reaction (LCR) system, nucleic acid sequence based amplification (NASBA, Cangene, Mississauga, Ontario), Q-Beta Replicase systems, transcription-based amplification system (TAS), and strand displacement amplification (SDA). See, e.g., Diagnostic Molecular Microbiology: Principles and Applications, D. H. Persing et al., Ed., American Society for Microbiology, Washington, D.C. (1993). The product of amplification is termed an amplicon.

As used herein, “antisense orientation” includes reference to a duplex polynucleotide sequence that is operably linked to a promoter in an orientation where the antisense strand is transcribed. The antisense strand is sufficiently complementary to an endogenous transcription product such that translation of the endogenous transcription product is often inhibited.

As used herein, “chromosomal region” includes reference to a length of a chromosome that may be measured by reference to the linear segment of DNA that it comprises. The chromosomal region can be defined by reference to two unique DNA sequences, i.e., markers.

The term “conservatively engineered variants” applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively engineered variants refers to those nucleic acids which encode identical or conservatively engineered variants of the amino acid sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations” and represent one species of conservatively engineered variation. Every nucleic acid sequence herein that encodes a polypeptide also, by reference to the genetic code, describes every possible silent variation of the nucleic acid. One of ordinary skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine; and UGG, which is ordinarily the only codon for tryptophan) can be engineered to yield a functionally identical molecule. Accordingly, each silent variation of a nucleic acid which encodes a polypeptide of the present invention is implicit in each described polypeptide sequence and is within the scope of the present invention.
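The degeneracy described above can be illustrated with a short sketch. It is not part of the disclosed methods; it assumes the standard ("universal") genetic code written in the RNA alphabet, and the function names are hypothetical, chosen only for this illustration.

```python
from itertools import product

# Standard genetic code in the RNA alphabet, laid out in the conventional
# U, C, A, G ordering of first, second, and third codon positions.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_codons(codon):
    """Return every codon encoding the same amino acid as `codon` (itself included)."""
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def translate(rna):
    """Translate an RNA string codon by codon; '*' marks a stop codon."""
    usable = len(rna) - len(rna) % 3
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3))


def count_synonymous_sequences(rna):
    """Count the coding sequences (the input included) that encode the same
    polypeptide, i.e. the input plus all of its silent variations."""
    usable = len(rna) - len(rna) % 3
    total = 1
    for i in range(0, usable, 3):
        total *= len(synonymous_codons(rna[i:i + 3]))
    return total


if __name__ == "__main__":
    print(synonymous_codons("GCA"))   # the four alanine codons named in the text
    print(synonymous_codons("AUG"))   # methionine has a single codon
    print(count_synonymous_sequences("AUGGCAUGG"))
```

For the hypothetical input AUGGCAUGG (Met-Ala-Trp), only the alanine codon can vary, so the count reflects the four alanine codons.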

As to amino acid sequences, one of skill will recognize that an individual substitution, deletion or addition to a nucleic acid, peptide, polypeptide, or protein sequence which alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a “conservatively engineered variant” where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Thus, any number of amino acid residues selected from the group of integers consisting of from 1 to 15 can be so altered. Thus, for example, 1, 2, 3, 4, 5, 7, or 10 alterations can be made. Conservatively engineered variants typically provide similar biological activity as the unengineered polypeptide sequence from which they are derived. For example, substrate specificity, enzyme activity, or ligand/receptor binding is generally at least 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the native protein for its native substrate. Conservative substitution tables providing functionally similar amino acids are well known in the art.

The following six groups each contain amino acids that are conservative substitutions for one another:

1) Alanine (A), Serine (S), Threonine (T);
2) Aspartic acid (D), Glutamic acid (E);
3) Asparagine (N), Glutamine (Q);
4) Arginine (R), Lysine (K);
5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); and
6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).

See also, Creighton (1984) Proteins W. H. Freeman and Company.
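As a minimal sketch of how the six groups above might be applied, the following compares two aligned polypeptide sequences of equal length and flags each substitution as conservative or not. The function names and example sequences are hypothetical; residues outside the six groups (for example glycine or proline) are simply treated as having no conservative partner.

```python
# The six conservative-substitution groups listed above, by one-letter code.
CONSERVATIVE_GROUPS = [
    set("AST"),   # 1) Alanine, Serine, Threonine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative(a, b):
    """True if a and b are different residues belonging to the same group."""
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def classify_substitutions(native, variant):
    """List (position, native residue, variant residue, conservative?) for every
    position at which two equal-length sequences differ (positions are 1-based)."""
    if len(native) != len(variant):
        raise ValueError("sequences must be the same length (substitutions only)")
    changes = []
    for pos, (x, y) in enumerate(zip(native, variant), start=1):
        if x != y:
            changes.append((pos, x, y, is_conservative(x, y)))
    return changes


if __name__ == "__main__":
    for change in classify_substitutions("MAKD", "MSKE"):
        print(change)
```

The sketch handles substitutions only; deletions and additions would require an alignment step that is outside its scope.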

By “encoding” or “encoded”, with respect to a specified nucleic acid, is meant comprising the information for translation into the specified protein. A nucleic acid encoding a protein may comprise non-translated sequences (e.g., introns) within translated regions of the nucleic acid, or may lack such intervening non-translated sequences (e.g., as in cDNA). The information by which a protein is encoded is specified by the use of codons. Typically, the amino acid sequence is encoded by the nucleic acid using the “universal” genetic code. However, variants of the universal code, such as are present in some plant, animal, and fungal mitochondria, the bacterium Mycoplasma capricolum, or the ciliate Macronucleus, may be used when the nucleic acid is expressed therein.

When the nucleic acid is prepared or altered synthetically, advantage can be taken of known codon preferences of the intended host where the nucleic acid is to be expressed. For example, although nucleic acid sequences of the present invention may be expressed in both monocotyledonous and dicotyledonous plant species, sequences can be engineered to account for the specific codon preferences and GC content preferences of monocotyledons or dicotyledons as these preferences have been shown to differ (Murray et al. Nucl. Acids Res. 17:477-498 (1989)). Thus, the maize preferred codon for a particular amino acid may be derived from known gene sequences from maize. Maize codon usage for 28 genes from maize plants is listed in Table 4 of Murray et al., supra.

As used herein “full-length sequence” in reference to a specified polynucleotide or its encoded protein means having the entire amino acid sequence of a native (non-synthetic), endogenous, biologically active form of the specified protein. Methods to determine whether a sequence is full-length are well known in the art including such exemplary techniques as northern or western blots, primer extensions, S1 protection, and ribonuclease protection. See, e.g., Plant Molecular Biology: A Laboratory Manual, Clark, Ed., Springer-Verlag, Berlin (1997). Comparison to known full-length homologous (orthologous and/or paralogous) sequences can also be used to identify full-length sequences of the present invention. Additionally, consensus sequences typically present at the 5′ and 3′ untranslated regions of mRNA aid in the identification of a polynucleotide as full-length. For example, the consensus sequence ANNNNAUGG, where the AUG codon represents the N-terminal methionine, aids in determining whether the polynucleotide has a complete 5′ end. Consensus sequences at the 3′ end, such as polyadenylation sequences, aid in determining whether the polynucleotide has a complete 3′ end.
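The 5′ consensus check mentioned above can be sketched as a simple pattern search. This is only an illustration: the function name is hypothetical, N is taken to mean any of A, C, G or U, and DNA input is accepted by converting T to U. A match is merely suggestive of a complete 5′ end and does not replace the experimental methods listed.

```python
import re

# Lookahead form of A-N-N-N-N-A-U-G-G so that overlapping matches are all found.
CONSENSUS = re.compile(r"(?=A[ACGU]{4}AUGG)")


def find_start_contexts(sequence):
    """Return the 0-based index of each AUG that sits in an ANNNNAUGG context."""
    rna = sequence.upper().replace("T", "U")
    # The AUG begins five nucleotides after the leading A of the consensus.
    return [m.start() + 5 for m in CONSENSUS.finditer(rna)]


if __name__ == "__main__":
    # Hypothetical sequence: A at index 2, four nucleotides, then AUGG at index 7.
    print(find_start_contexts("GGACCGGAUGGCU"))
```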

As used herein, “heterologous” in reference to a nucleic acid is a nucleic acid that originates from a foreign species, or, if from the same species, is substantially engineered from its native form in composition and/or genomic locus by deliberate human intervention. For example, a promoter operably linked to a heterologous structural gene is from a species different from that from which the structural gene was derived, or, if from the same species, one or both are substantially engineered from their original form. A heterologous protein may originate from a foreign species or, if from the same species, is substantially engineered from its original form by deliberate human intervention.

By “host cell” is meant a cell which contains a vector and supports the replication and/or expression of the vector. Host cells may be prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or mammalian cells. Preferably, host cells are monocotyledonous or dicotyledonous plant cells. A particularly preferred monocotyledonous host cell is a maize host cell.

The term “hybridization complex” includes reference to a duplex nucleic acid structure formed by two single-stranded nucleic acid sequences selectively hybridized with each other.

The term “introduced” in the context of inserting a nucleic acid into a cell, means “transfection” or “transformation” or “transduction” and includes reference to the incorporation of a nucleic acid into a eukaryotic or prokaryotic cell where the nucleic acid may be incorporated into the genome of the cell (e.g., chromosome, plasmid, plastid or mitochondrial DNA), converted into an autonomous replicon, or transiently expressed (e.g., transfected mRNA).

The term “isolated” refers to material, such as a nucleic acid or a protein, which is: (1) substantially or essentially free from components that normally accompany or interact with it as found in its naturally occurring environment. The isolated material optionally comprises material not found with the material in its natural environment; or (2) if the material is in its natural environment, the material has been synthetically (non-naturally) altered by deliberate human intervention to a composition and/or placed at a location in the cell (e.g., genome or subcellular organelle) not native to a material found in that environment. The alteration to yield the synthetic material can be performed on the material within or removed from its natural state. For example, a naturally occurring nucleic acid becomes an isolated nucleic acid if it is altered, or if it is transcribed from DNA which has been altered, by means of human intervention performed within the cell from which it originates. See, e.g., Compounds and Methods for Site Directed Mutagenesis in Eukaryotic Cells, Kmiec, U.S. Pat. No. 5,565,350; In Vivo Homologous Sequence Targeting in Eukaryotic Cells; Zarling et al., PCT/US93/03868. Likewise, a naturally occurring nucleic acid (e.g., a promoter) becomes isolated if it is introduced by non-naturally occurring means to a locus of the genome not native to that nucleic acid. Nucleic acids which are “isolated” as defined herein, are also referred to as “heterologous” nucleic acids.

As used herein, “localized within the chromosomal region defined by and including” with respect to particular markers includes reference to a contiguous length of a chromosome delimited by and including the stated markers.

As used herein, “marker” includes reference to a locus on a chromosome that serves to identify a unique position on the chromosome. A “polymorphic marker” includes reference to a marker which appears in multiple forms (alleles) such that different forms of the marker, when they are present in a homologous pair, allow transmission of each of the chromosomes of that pair to be followed. A genotype may be defined by use of one or a plurality of markers.

As used herein, “nucleic acid” or “nucleotide” includes reference to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, encompasses known analogues having the essential nature of natural nucleotides in that they hybridize to single-stranded nucleic acids in a manner similar to naturally occurring nucleotides (e.g., peptide nucleic acids).

By “nucleic acid library” is meant a collection of isolated DNA or RNA molecules which comprise and substantially represent the entire transcribed fraction of a genome of a specified organism. Construction of exemplary nucleic acid libraries, such as genomic and cDNA libraries, is taught in standard molecular biology references such as Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology, Vol. 152, Academic Press, Inc., San Diego, Calif. (Berger); Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Vol. 1-3 (1989); and Current Protocols in Molecular Biology, F. M. Ausubel et al., Eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc. (1994).

As used herein “operably linked” includes reference to a functional linkage between a promoter and a second sequence, wherein the promoter sequence initiates and mediates transcription of the DNA sequence corresponding to the second sequence. Generally, operably linked means that the nucleic acid sequences being linked are contiguous and, where necessary to join two protein coding regions, contiguous and in the same reading frame.

As used herein, the term “plant” can include reference to whole plants, plant parts or organs (e.g., leaves, stems, roots, etc.), plant cells, seeds and progeny of same. Plant cell, as used herein, further includes, without limitation, cells obtained from or found in: seeds, suspension cultures, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen, and microspores. Plant cells can also be understood to include engineered cells, such as protoplasts, obtained from the aforementioned tissues. The class of plants which can be used in the methods of the invention is generally as broad as the class of higher plants amenable to transformation techniques, including both monocotyledonous and dicotyledonous plants. Particularly preferred plants include maize, soybean, sunflower, sorghum, canola, wheat, alfalfa, cotton, rice, barley, and millet.

As used herein, “polynucleotide” includes reference to a deoxyribopolynucleotide, ribopolynucleotide, or analogs thereof that have the essential nature of a natural ribonucleotide in that they hybridize, under stringent hybridization conditions, to substantially the same nucleotide sequence as naturally occurring nucleotides and/or allow translation into the same amino acid(s) as the naturally occurring nucleotide(s). A polynucleotide can be full-length or a subsequence of a native or heterologous structural or regulatory gene. Unless otherwise indicated, the term includes reference to the specified sequence as well as the complementary sequence thereof. Thus, DNAs or RNAs with backbones engineered for stability or for other reasons are “polynucleotides” as that term is intended herein. Moreover, DNAs or RNAs comprising unusual bases, such as inosine, or engineered bases, such as tritylated bases, to name just two examples, are polynucleotides as the term is used herein. It will be appreciated that a great variety of engineering has been made to DNA and RNA that serves many useful purposes known to those of skill in the art. The term polynucleotide as it is employed herein embraces such chemically, enzymatically or metabolically engineered forms of polynucleotides, as well as the chemical forms of DNA and RNA characteristic of viruses and cells, including among other things, simple and complex cells.

The terms “polypeptide”, “peptide” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The essential nature of such analogues of naturally occurring amino acids is that, when incorporated into a protein, that protein is specifically reactive to antibodies elicited to the same protein but consisting entirely of naturally occurring amino acids. The terms “polypeptide”, “peptide” and “protein” are also inclusive of engineering including, but not limited to, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation. It will be appreciated, as is well known and as noted above, that polypeptides are not always entirely linear. For instance, polypeptides may be branched as a result of ubiquitination, and they may be circular, with or without branching, generally as a result of posttranslation events, including natural processing events and events brought about by human manipulation which do not occur naturally. Circular, branched and branched circular polypeptides may be synthesized by non-translation natural processes and by entirely synthetic methods, as well. Further, this invention contemplates the use of both the methionine-containing and the methionine-less amino terminal variants of the protein of the invention.

As used herein “promoter” includes reference to a region of DNA upstream from the start of transcription and involved in recognition and binding of RNA polymerase and other proteins to initiate transcription. A “plant promoter” is a promoter capable of initiating transcription in plant cells whether or not its origin is a plant cell. Exemplary plant promoters include, but are not limited to, those that are obtained from plants, plant viruses, and bacteria such as Agrobacterium or Rhizobium which comprise genes expressed in plant cells. Examples of promoters under developmental control include promoters that preferentially initiate transcription in certain tissues, such as leaves, roots, or seeds. Such promoters are referred to as “tissue preferred”. Promoters which initiate transcription only in certain tissue are referred to as “tissue specific”. A “cell type” specific promoter primarily drives expression in certain cell types in one or more organs, for example, vascular cells in roots or leaves. An “inducible” or “repressible” promoter is a promoter which is under environmental control. Examples of environmental conditions that may affect transcription by inducible promoters include anaerobic conditions or the presence of light. Tissue specific, tissue preferred, cell type specific, and inducible promoters constitute the class of “non-constitutive” promoters. A “constitutive” promoter is a promoter which is active under most environmental conditions, and in most plant parts.

As used herein “recombinant” includes reference to a cell or vector that has been engineered by the introduction of a heterologous nucleic acid or that the cell is derived from a cell so engineered. Thus, for example, recombinant cells express genes that are not found in identical form within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, under-expressed or not expressed at all as a result of deliberate human intervention. The term “recombinant” as used herein does not encompass the alteration of the cell or vector by naturally occurring events (e.g., spontaneous mutation, natural transformation/transduction/transposition) such as those occurring without deliberate human intervention.

As used herein, an “expression cassette” is a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements which permit transcription of a particular nucleic acid in a host cell. The recombinant expression cassette can be incorporated into a plasmid, chromosome, mitochondrial DNA, plastid DNA, virus, or nucleic acid fragment. Typically, the recombinant expression cassette portion of an expression vector includes, among other sequences, a nucleic acid to be transcribed, and a promoter.

The terms “residue”, “amino acid residue” and “amino acid” are used interchangeably herein to refer to an amino acid that is incorporated into a protein, polypeptide, or peptide (collectively “protein”). The amino acid may be a naturally occurring amino acid and, unless otherwise limited, may encompass non-natural analogs of natural amino acids that can function in a similar manner as naturally occurring amino acids.

The term “selectively hybridizes” includes reference to hybridization, under stringent hybridization conditions, of a nucleic acid sequence to a specified nucleic acid target sequence to a detectably greater degree (e.g., at least 2-fold over background) than its hybridization to non-target nucleic acid sequences and to the substantial exclusion of non-target nucleic acids. Selectively hybridizing sequences typically have at least about 80% sequence identity, preferably 90% sequence identity, and most preferably 100% sequence identity (i.e., complementary) with each other.

The term “stringent conditions” or “stringent hybridization conditions” includes reference to conditions under which a probe will hybridize to its target sequence, to a detectably greater degree than to other sequences (e.g., at least 2-fold over background). Stringent conditions are sequence-dependent and may be different in different circumstances. By controlling the stringency of the hybridization and/or washing conditions, target sequences can be identified which are 100% complementary to the probe (homologous probing). Alternatively, stringency conditions can be adjusted to allow some mismatching in sequences so that lower degrees of similarity are detected (heterologous probing). Generally, a probe is less than about 1000 nucleotides in length, optionally less than 500 nucleotides in length.

Typically, stringent conditions will be those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g., 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. Exemplary low stringency conditions include hybridization with a buffer solution of 30 to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulphate) at 37° C., and a wash in 1× to 2×SSC (20×SSC=3.0 M NaCl/0.3 M trisodium citrate) at 50 to 55° C. Exemplary moderate stringency conditions include hybridization in 40 to 45% formamide, 1 M NaCl, 1% SDS at 37° C., and a wash in 0.5× to 1×SSC at 55 to 60° C. Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37° C., and a wash in 0.1×SSC at 60 to 65° C.

Specificity is typically the function of post-hybridization washes, the critical factors being the ionic strength and temperature of the final wash solution. For DNA-DNA hybrids, the T_m can be approximated from the equation of Meinkoth and Wahl, Anal. Biochem., 138:267-284 (1984):

T_m = 81.5° C. + 16.6 (log M) + 0.41 (% GC) − 0.61 (% form) − 500/L

where M is the molarity of monovalent cations, % GC is the percentage of guanosine and cytosine nucleotides in the DNA, % form is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs. The T_m is the temperature (under defined ionic strength and pH) at which 50% of the complementary target sequence hybridizes to a perfectly matched probe. T_m is reduced by about 1° C. for each 1% of mismatching; thus, T_m, hybridization and/or wash conditions can be adjusted to hybridize to sequences of the desired identity. For example, if sequences with ≥90% identity are sought, the T_m can be decreased 10° C. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (T_m) for the specific sequence and its complement at a defined ionic strength and pH. However, severely stringent conditions can utilize a hybridization and/or wash at 1, 2, 3, or 4° C. lower than the thermal melting point (T_m); moderately stringent conditions can utilize a hybridization and/or wash at 6, 7, 8, 9, or 10° C. lower than the thermal melting point (T_m); low stringency conditions can utilize a hybridization and/or wash at 11, 12, 13, 14, 15, or 20° C. lower than the thermal melting point (T_m). Using the equation, hybridization and wash compositions, and desired T_m, those of ordinary skill will understand that variations in the stringency of hybridization and/or wash solutions are inherently described. If the desired degree of mismatching results in a T_m of less than 45° C. (aqueous solution) or 32° C. (formamide solution) it is preferred to increase the SSC concentration so that a higher temperature can be used. An extensive guide to the hybridization of nucleic acids is found in Tijssen, Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acids Probes, Part I, Chapter 2, Ausubel, et al., Eds., Greene Publishing and Wiley-Interscience, N.Y. (1995).
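By way of illustration only, the Meinkoth and Wahl approximation and the rule of about 1° C. per 1% of mismatching can be evaluated as in the following sketch. The function names and the example input values are hypothetical and are not taken from the specification, and the logarithm is assumed to be base 10, as is conventional for this equation.

```python
import math


def melting_temperature(molarity, percent_gc, percent_formamide, length_bp):
    """Approximate T_m (deg C) of a DNA-DNA hybrid (Meinkoth and Wahl, 1984).

    molarity          -- molarity of monovalent cations (M)
    percent_gc        -- percentage of G and C nucleotides (% GC)
    percent_formamide -- percentage of formamide in the solution (% form)
    length_bp         -- length of the hybrid in base pairs (L)
    """
    return (81.5
            + 16.6 * math.log10(molarity)   # assumption: "log" means log10
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 500.0 / length_bp)


def adjusted_tm(tm, percent_mismatch):
    """Lower T_m by about 1 deg C for each 1% of mismatching."""
    return tm - 1.0 * percent_mismatch


# Hypothetical example values, chosen only to exercise the formula.
tm = melting_temperature(molarity=1.0, percent_gc=50.0,
                         percent_formamide=50.0, length_bp=500)
tm_for_90_percent_identity = adjusted_tm(tm, percent_mismatch=10.0)
print(round(tm, 1), round(tm_for_90_percent_identity, 1))
```

A wash temperature for a given stringency class could then be chosen by subtracting the offsets listed above (for example, about 5° C. for stringent conditions) from the adjusted value.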

As used herein, the term “structural gene” includes any nucleotide sequence the expression of which is desired in a plant cell. A structural gene can include an entire sequence encoding a protein, an open reading frame or any portion thereof, or an antisense sequence. Examples of structural genes included hereinafter are intended for illustration and not limitation.

As used herein, “transgenic plant” includes reference to a plant whichcomprises within its genome a heterologous polynucleotide. Generally,the heterologous polynucleotide is stably integrated within the genomesuch that the polynucleotide is passed on to successive generations. Theheterologous polynucleotide may be integrated into the genome alone oras part of a recombinant expression cassette. “Transgenic” is usedherein to include any cell, cell line, callus, tissue, plant part orplant, the genotype of which has been altered by the presence ofheterologous nucleic acid including those transgenics initially soaltered as well as those created by sexual crosses or asexualpropagation from the initial transgenic. The term “transgenic” as usedherein does not encompass the alteration of the genome (chromosomal orextra-chromosomal) by conventional plant breeding methods or bynaturally occurring events such as random cross-fertilization,non-recombinant viral infection, non-recombinant bacterialtransformation, non-recombinant transposition, or spontaneous mutation.

As used herein, “vector” includes reference to a nucleic acid used in transfection of a host cell and into which can be inserted a polynucleotide. Vectors are often bacterial plasmids or replicons. Expression vectors permit transcription of a nucleic acid inserted therein.

The following terms are used to describe the sequence relationships between two or more nucleic acids or polynucleotides: (a) “reference sequence”, (b) “comparison window”, (c) “sequence identity”, (d) “percentage of sequence identity”, and (e) “substantial identity”.

(a) As used herein, “reference sequence” is a defined sequence used as abasis for sequence comparison. A reference sequence may be a subset orthe entirety of a specified sequence; for example, as a segment of afull-length cDNA or gene sequence, or the complete cDNA or genesequence. (b) As used herein, “comparison window” includes reference toa contiguous and specified segment of a polynucleotide sequence, whereinthe polynucleotide sequence may be compared to a reference sequence andwherein the portion of the polynucleotide sequence in the comparisonwindow may comprise additions or deletions (i.e., gaps) compared to thereference sequence (which does not comprise additions or deletions) foroptimal alignment of the two sequences. Generally, the comparison windowis at least 20 contiguous nucleotides in length, and optionally can be30, 40, 50, 100, or longer. Those of skill in the art understand that toavoid a high similarity to a reference sequence due to inclusion of gapsin the polynucleotide sequence, a gap penalty is typically introducedand is subtracted from the number of matches.

Methods of alignment of sequences for comparison are well-known in theart. Optimal alignment of sequences for comparison may be conducted bythe local homology algorithm of Smith and Waterman, Adv. Appl. Math.2:482 (1981); by the homology alignment algorithm of Needleman andWunsch, J. Mol. Biol. 48:443 (1970); by the search for similarity methodof Pearson and Lipman, Proc. Natl. Acad. Sci. 85:2444 (1988); bycomputerized implementations of these algorithms, including, but notlimited to: CLUSTAL in the PC/Gene program by Intelligenetics, MountainView, Calif.; GAP, BESTFIT, BLAST, FASTA, and TFASTA in the WisconsinGenetics Software Package, Genetics Computer Group (GCG), 575 ScienceDr., Madison, Wis., USA; the CLUSTAL program is well described byHiggins and Sharp, Gene 73:237-244 (1988); Higgins and Sharp, CABIOS5:151-153 (1989); Corpet, et al., Nucleic Acids Research 16:10881-90(1988); Huang, et al., Computer Applications in the Biosciences 8:155-65(1992), and Pearson, et al., Methods in Molecular Biology 24:307-331(1994). The BLAST family of programs which can be used for databasesimilarity searches includes: BLASTN for nucleotide query sequencesagainst nucleotide database sequences; BLASTX for nucleotide querysequences against protein database sequences; BLASTP for protein querysequences against protein database sequences; TBLASTN for protein querysequences against nucleotide database sequences; and TBLASTX fornucleotide query sequences against nucleotide database sequences. See,Current Protocols in Molecular Biology, Chapter 19, Ausubel, et al.,Eds., Greene Publishing and Wiley-Interscience, New York (1995).

Unless otherwise stated, sequence identity/similarity values provided herein refer to the value obtained using the BLAST 2.0 suite of programs using default parameters. Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997). Software for performing BLAST analyses is publicly available, e.g., through the National Center for Biotechnology Information (world wide web at ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=−4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1989) Proc. Natl. Acad. Sci. USA 89:10915).
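The extension step described above can be illustrated with a deliberately simplified sketch. It performs only an ungapped, rightward extension of an exact nucleotide word hit, using the M and N values given above; the drop-off value X=20 and all function and variable names are assumptions made for illustration, and the code is not the BLAST implementation itself.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Ungapped rightward extension of an exact word hit.

    Returns (best_score, best_length), where best_length counts the seed
    word plus the extension that gave the maximum cumulative score.
    x_drop is a hypothetical value for the quantity X.
    """
    score = match * word_len          # the seed is assumed to match exactly
    best_score, best_len = score, word_len
    length = word_len
    i, j = q_start + word_len, s_start + word_len
    while i < len(query) and j < len(subject):   # stop at end of either sequence
        score += match if query[i] == subject[j] else mismatch
        length += 1
        if score > best_score:
            best_score, best_len = score, length
        # stop when the score falls off by X from its maximum, or reaches zero
        if best_score - score >= x_drop or score <= 0:
            break
        i += 1
        j += 1
    return best_score, best_len


# Hypothetical sequences: 12 matching bases followed by 4 mismatches.
query = "ACGTACGTACGTTTTT"
subject = "ACGTACGTACGTAAAA"
result = extend_right(query, subject, q_start=0, s_start=0, word_len=11)
print(result)
```

A leftward extension would proceed symmetrically from the start of the word hit, and the two results would be combined to give the HSP.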

In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Natl. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance.

BLAST searches assume that proteins can be modeled as random sequences.However, many real proteins comprise regions of nonrandom sequenceswhich may be homopolymeric tracts, short-period repeats, or regionsenriched in one or more amino acids. Such low-complexity regions may bealigned between unrelated proteins even though other regions of theprotein are entirely dissimilar. A number of low-complexity filterprograms can be employed to reduce such low-complexity alignments. Forexample, the SEG (Wooten and Federhen, Comput. Chem., 17:149-163 (1993))and XNU (Claverie and States, Comput. Chem., 17:191-201 (1993))low-complexity filters can be employed alone or in combination.

(c) As used herein, “sequence identity” or “identity” in the context oftwo nucleic acid or polypeptide sequences includes reference to theresidues in the two sequences which are the same when aligned formaximum correspondence over a specified comparison window. Whenpercentage of sequence identity is used in reference to proteins it isrecognized that residue positions which are not identical often differby conservative amino acid substitutions, where amino acid residues aresubstituted for other amino acid residues with similar chemicalproperties (e.g. charge or hydrophobicity) and therefore do not changethe functional properties of the molecule. Where sequences differ inconservative substitutions, the percent sequence identity may beadjusted upwards to correct for the conservative nature of thesubstitution. Sequences which differ by such conservative substitutionsare said to have “sequence similarity” or “similarity”. Means for makingthis adjustment are well-known to those of skill in the art. Typicallythis involves scoring a conservative substitution as a partial ratherthan a full mismatch, thereby increasing the percentage sequenceidentity. Thus, for example, where an identical amino acid is given ascore of 1 and a non-conservative substitution is given a score of zero,a conservative substitution is given a score between zero and 1. Thescoring of conservative substitutions is calculated, e.g., according tothe algorithm of Meyers and Miller, Computer Applic. Biol. Sci., 4:11-17(1988) e.g., as implemented in the program PC/GENE (Intelligenetics,Mountain View, Calif., USA).

(d) As used herein, “percentage of sequence identity” means the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
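As a minimal sketch of this calculation, assuming the two sequences have already been optimally aligned and that gaps are written as “-” (both the function name and the example sequences are hypothetical):

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of sequence identity over an aligned comparison window.

    Both inputs must be the same length; a gap character never counts
    as a matched position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matched = sum(1 for a, b in zip(aligned_a, aligned_b)
                  if a == b and a != gap)
    # matched positions / total positions in the window, times 100
    return 100.0 * matched / len(aligned_a)


# Hypothetical 10-position window with one gap and one mismatch.
identity = percent_identity("ACGT-ACGTA", "ACGTTACGAA")
print(identity)
```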

(e)(i) The term “substantial identity” of polynucleotide sequences means that a polynucleotide comprises a sequence that has at least 70% sequence identity, preferably at least 80%, more preferably at least 90% and most preferably at least 95%, compared to a reference sequence using one of the alignment programs described using standard parameters. One of skill will recognize that these values can be appropriately adjusted to determine corresponding identity of proteins encoded by two nucleotide sequences by taking into account codon degeneracy, amino acid similarity, reading frame positioning and the like. Substantial identity of amino acid sequences for these purposes normally means sequence identity of at least 60%, or preferably at least 70%, 80%, 90%, and most preferably at least 95%.

Another indication that nucleotide sequences are substantially identicalis if two molecules hybridize to each other under stringent conditions.However, nucleic acids which do not hybridize to each other understringent conditions are still substantially identical if thepolypeptides which they encode are substantially identical. This mayoccur, e.g., when a copy of a nucleic acid is created using the maximumcodon degeneracy permitted by the genetic code. One indication that twonucleic acid sequences are substantially identical is that thepolypeptide which the first nucleic acid encodes is immunologicallycross reactive with the polypeptide encoded by the second nucleic acid.

(e)(ii) The term “substantial identity” in the context of a peptide indicates that a peptide comprises a sequence with at least 70% sequence identity to a reference sequence, preferably 80%, or preferably 85%, most preferably at least 90% or 95% sequence identity to the reference sequence over a specified comparison window. Optionally, optimal alignment is conducted using the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970). An indication that two peptide sequences are substantially identical is that one peptide is immunologically reactive with antibodies raised against the second peptide. Thus, a peptide is substantially identical to a second peptide, for example, where the two peptides differ only by a conservative substitution. Peptides which are “substantially similar” share sequences as noted above except that residue positions which are not identical may differ by conservative amino acid changes.

As used herein, the term “maize ubiquitin promoter”, or “ubiquitin promoter”, or “ubiquitin-1 promoter” or “Ubi-1 promoter” shall include a 5′ promoter region from a gene encoding ubiquitin, or a protein with the functional characteristics of ubiquitin, and shall include the 5′ region of the maize ubiquitin gene described in Quail, bases −899 to 1092, including sequences which are capable of hybridizing under conditions of high stringency thereto.

As used herein the term “engineered ubiquitin promoter” or “Ubi-1 promoter variant” shall include a ubiquitin promoter which has a heat shock region that is engineered from its native state and which is capable of directing expression in a plant cell.

As used herein the term “heat shock region” shall include an area of a ubiquitin promoter sequence which comprises two overlapping heat shock elements and includes bases −214 to −189 of the sequence disclosed in Quail.

DESCRIPTION OF THE FIGURES

FIGS. 1A, 1B and 1C are graphs showing expression of GUS driven by Ubi-1 engineered promoter variants in tissues derived from independent stable transformation events. In FIG. 1A, embryogenic callus tissue is depicted. A mean level of GUS was determined among transformation events for each Ubi-1 variant (GSB=wild type, yellow; GSC=HSEs deleted; GSD=3′ HSE deleted; GSE=5′ HSE deleted; GSF=HSEs adjacent; GSG=HSEs replaced by Ps1 trimer).

In FIG. 1B, leaf tissue of seedlings regenerated from tissue culture is depicted. A mean level of GUS was determined among one to eight plants derived from independent transformation events. From these data, a mean level of GUS was determined for each Ubi-1 variant. No data were available for GSD.

In FIG. 1C, T1 seed is depicted. The highest level of GUS was determined among five seeds for each T0 plant. A mean level of GUS was then determined for this high expressing seed among one to ten T0 plants for each independent transformation event. From these data, a mean level of GUS was determined for each Ubi-1 promoter variant. For FIGS. 1A, B and C, tissue with no GUS activity was not included in the analysis. The number of transformation events (n) per DNA construct is shown. 95% confidence levels are shown for the mean values. Note the difference in the y-axis scale for A, B and C.

FIG. 2 is a graph depicting expression of GUS driven by Ubi-1 promoter variants in T1 seed of stably transformed lines. The highest level of GUS was determined among five seeds for each T0 plant. A mean level of GUS was then determined for this high expressing seed among one to ten T0 plants for each independent transformation event. From these data, a highest recorded (light bar) and a mean (dark bar) level of GUS were determined for each Ubi-1 variant. Plants that produced no seed with GUS activity were not included in the analysis. The number of transformation events (n) per DNA construct is shown. 95% confidence levels are shown for the mean values.

FIGS. 3A and 3B are graphs depicting expression of GUS in T1 seed and inleaves of T1 plants.

In FIG. 3A, the highest level of GUS was determined among five seeds foreach T0 plant, and then these data were used to select the highestexpressing seed pools derived from several of the independenttransformation events. A highest recorded (light bar) and a mean (darkbar) level of GUS were then determined among the selected seed pools foreach promoter variant.

In FIG. 3B, leaves were analyzed from three herbicide resistant T1 plants derived from each selected T1 seed pool, and the highest observed level of GUS in leaf tissue was recorded for each pool. From these data, a highest recorded (light bar) and a mean (dark bar) level of GUS were determined among the selected leaf tissue for each promoter variant. For FIGS. 3A and B, the number of transformation events chosen (n) per DNA construct is shown. 95% confidence levels are shown for the mean values. Note the difference in the y-axis scale for A and B.

DETAILED DESCRIPTION OF THE INVENTION

The maize ubiquitin promoter has often been used to drive relatively high level expression of foreign genes in monocots, particularly economically important grasses (Cornejo, M. J., et al. (1993), “Activity of a maize ubiquitin promoter in transgenic rice”, Plant Mol. Biol. 23:567-581). Examples of genes expressed from this promoter include bar/pat for herbicide selection, uidA for GUS reporter gene expression to score transformation and, recently, genes for xenogenic protein production in maize (Hood, E. E., et al. (1997), “Commercial production of avidin from transgenic maize: characterization of transformant, production, processing, extraction and purification”, Mol. Breed. 3:291-306; Witcher, D. R., et al. (1998), “Commercial production of β-glucuronidase (GUS): a model system for the production of proteins in plants”, Mol. Breed. 4:301-312; Zhong, G-Y, et al. (1999), “Commercial production of aprotinin in transgenic maize seeds”, Mol. Breed. 5:345-356).

Maize Ubi-1 is one of the highest expressed constitutive genescharacterized in plants (Christensen et al., 1992, “Maize polyubiquitingenes: structure, thermal perturbation of expression and transcriptsplicing, and promoter activity following transfer to protoplasts byelectroporation”, Plant Mol. Biol. 18:675-689). Approximately 0.9 kb of5′ flanking sequence of Ubi-1, together with the 5′ untranslated leadersequence and the first intron, are sufficient to drive expression ofreporter genes in several monocot species (Christensen et al., 1992,supra; “Non-systemic expression of a stress-responsive maizepolyubiquitin gene (Ubi-1) in transgenic rice plants”, 1994, Takimoto etal., Plant Mol. Biol. 26:1007-1012; Christensen and Quail, 1996,“Ubiquitin promoter-based vectors for high-level expression ofselectable and/or screenable marker genes in monocotyledonous plants”,Transgenic Res. 5:213-218). The 5′ flanking sequence of Ubi-1 includesregions with similarity to defined cis-acting elements. A TATA box islocated in the consensus position, and two overlapping heat shockelements (HSEs) with similarity to the HSEs of Drosophila melanogastergenes (Pelham, H. R., et al. (1982), “A synthetic heat-shock promoterelement confers heat-inducibility on the herpes simplex virus thymidinekinase gene”, EMBO J., 1:1473-1477) are located approximately 0.2 kbupstream of the transcription start site.

In an effort to develop constitutive promoters which effect even higher levels of foreign gene expression in callus, leaves or seeds of grass species, the Applicants have developed promoters which have different controlling elements than the native maize polyubiquitin-1 promoters. Engineering was focused on the overlapping heat shock elements (HSEs) approximately 200 bases 5′ to the start of transcription. These elements were removed entirely, singly removed or placed in tandem as opposed to their native overlapping arrangement. A final variant contained a seed preferred binding domain in place of the native elements. Three of the five promoter variants effected higher level expression of GUS reporter protein in seed, and two of these were more effective in leaves than the wild type maize ubiquitin promoter. The new promoters are surprising, as it was previously thought that two heat shock elements needed to be present and, further, that these elements needed to overlap for functional promoter activity.

Quite surprisingly, these novel promoters changed the tissue preference for expression, from primarily embryo expression to increased expression in the endosperm with decreased embryo expression. One of the variants completely reversed the ratio of embryo to endosperm expression, resulting in an endosperm preferred expression profile.

According to the invention, novel promoters have been designed which include ubiquitin promoter variants with engineering primarily of the heat shock region at −214 to −190 of the ubiquitin promoter.

Typically this region is comprised of two overlapping heat shock elements having the following sequence:

CTGGACCCCTCTCGAGAGTTCCGCT (SEQ ID NO:1)

The 5′ heat shock consensus sequence is underlined. The 3′ heat shock consensus sequence is overlined. As can be seen, the overlap is a CTCGA 5-mer. According to the invention, novel promoters are designed which do not include two overlapping heat shock elements. Variants include deletion of both heat shock elements, deletion of the 3′ element, deletion of the 5′ element, and removal of the overlap so that the two elements are adjacent.

A chart depicting the engineering in the heat shock region is below:

TABLE 1
Engineering of Ubi-1 promoter HSE

DNA construct   HSE DNA sequence¹                                HSE engineering               Transgenic lines
PGN7062         CTGGACCCCTCTCGAGAGTTCCGCT (SEQ ID NO: 1)         wild type                     GSB
PGN7547         -------------------------                        HSEs deleted                  GSC
PGN7565         CTGGACCCCTCTCGA---------- (SEQ ID NO: 2)         3′ HSE deleted                GSD
PGN7583         ----------CTCGAGAGTTCCGCT (SEQ ID NO: 3)         5′ HSE deleted                GSE
PGN7600         CTGGACCCCTCTCGACTCGAGAGTTCCGCT (SEQ ID NO: 4)    HSEs adjacent                 GSF
PGN8926         3x(GACACGTAGAATGAGTCATCAC) (SEQ ID NO: 5)        HSEs replaced by Ps1 trimer   GSG

¹The 5′ HSE is in bold type and the 3′ HSE is underlined.
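The relationship between the constructs in Table 1 can be illustrated with a short sketch that assembles each heat shock region from its parts. The two 15-base segments used here are the sequences retained in the 3′ HSE-deleted and 5′ HSE-deleted constructs; treating them as the two elements, and all variable and function names, are assumptions made for illustration only. Only the engineered region is modeled, not the full promoter.

```python
# Segments taken from Table 1 (SEQ ID NOs: 2, 3 and 5).
FIVE_PRIME_SEGMENT = "CTGGACCCCTCTCGA"
THREE_PRIME_SEGMENT = "CTCGAGAGTTCCGCT"
OVERLAP = "CTCGA"                          # shared 5-mer in the native region
PS1_ELEMENT = "GACACGTAGAATGAGTCATCAC"     # repeated three times in PGN8926


def join_overlapping(five, three, overlap):
    """Join two segments so that the shared overlap appears only once."""
    if not (five.endswith(overlap) and three.startswith(overlap)):
        raise ValueError("segments do not share the stated overlap")
    return five + three[len(overlap):]


heat_shock_regions = {
    "PGN7062": join_overlapping(FIVE_PRIME_SEGMENT, THREE_PRIME_SEGMENT,
                                OVERLAP),                  # wild type
    "PGN7547": "",                                         # HSEs deleted
    "PGN7565": FIVE_PRIME_SEGMENT,                         # 3' HSE deleted
    "PGN7583": THREE_PRIME_SEGMENT,                        # 5' HSE deleted
    "PGN7600": FIVE_PRIME_SEGMENT + THREE_PRIME_SEGMENT,   # HSEs adjacent
    "PGN8926": PS1_ELEMENT * 3,                            # Ps1 trimer
}

for construct, region in heat_shock_regions.items():
    print(construct, len(region), region or "(deleted)")
```

Under these assumptions the wild type region reassembles to the 25-base sequence of SEQ ID NO: 1, and the adjacent arrangement is 30 bases long because the CTCGA 5-mer is present in both elements.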

In yet another embodiment a transcription binding factor can be added in the engineered heat shock element region, to aid in transcription of the sequences following the promoter. Such factors are known to those of skill in the art and include but are not limited to: the pea lectin seed-specific binding factor (dePater, S., et al. (1994), “A 22-bp fragment of the pea lectin promoter containing essential TGAC-like motifs confers seed-specific gene expression”, Plant Cell 5:877-886; dePater, S., et al. (1996), “The 22 bp W1 element in the pea lectin promoter is necessary and, as a multimer, sufficient for high gene expression in tobacco seeds”, Plant Mol. Biol. 32:515-523), and the basic domain/leucine zipper proteins TGA1a and Opaque-2, which can bind this sequence in vitro (dePater, S., et al. (1994), “bZIP proteins bind to a palindromic sequence without an ACGT core located in a seed-specific element of the pea lectin promoter”, Plant J. 6:133-140). A table of transcription factors which may be used according to the invention follows:

TABLE A

Species                Factor     Target Gene                               5′ extent of site   3′ extent of site   Site Sequence
Arabidopsis thaliana   EBP        Pathogenesis-related protein 1b           −207                −192                atGGCTctta (SEQ ID NO: 6)
Arabidopsis thaliana   HY5        Ribulose-1,5-biphosphate carboxylase      −241                −230                CTTCCACGTGGCA (SEQ ID NO: 7)
Hordeum vulgare        BLZ-1      B-hordein                                 −252                −220                acatgtaaagtgaataagGTGAGTCA (SEQ ID NO: 8)
Hordeum vulgare        Gamyb      High-pI alpha-amylase                     −149                −128                ggccgaTAACAAACtccggccg (SEQ ID NO: 9)
Oryza sativa           RF2a       Rice tungro bacilliform virus promoter    −53                 −39                 CCAGTGTGCCCCTGG (SEQ ID NO: 10)
Phaseolus vulgaris     ROM1       Phytohemagglutinin                        −207                −199                GCCACGTCA
Pisum sativum          GT-1       Ribulose-1,5-biphosphate carboxylase      −257                −245                GATTTACACT (SEQ ID NO: 11)
Triticum aestivum      SPA        Low molecular weight glutenin-1D1         −256                −241                taaGGTGAGTCATata (SEQ ID NO: 12)
Zea mays               Dof2       C4-type phosphoenolpyruvate carboxylase   −774                −765                ATACTTTTC (SEQ ID NO: 13)
Zea mays               Opaque-2   22-kD Zein                                −305                −288                tgTCATTCCACGTAGAtg (SEQ ID NO: 14)

Transgenic Techniques Overview

Likewise, by means of the present invention, agronomic genes incombination with the promoters of the invention can be expressed intransformed plants. Production of a genetically engineered plant tissueeither expressing or inhibiting expression of a structural gene combinesthe teachings of the present disclosure with a variety of techniques andexpedients known in the art. In most instances, alternate expedientsexist for each stage of the overall process. The choice of expedientsdepends on the variables such as the plasmid vector system chosen forthe cloning and introduction of the recombinant DNA molecule, the plantspecies to be engineered, the particular structural gene, promoterelements and upstream elements used. Persons skilled in the art are ableto select and use appropriate alternatives to achieve functionality.Culture conditions for expressing desired structural genes and culturedcells are known in the art. Also as known in the art, a number of bothmonocotyledonous and dicotyledonous plant species are transformable andregenerable such that whole plants containing and expressing desiredgenes under regulatory control of the promoter molecules according tothe invention may be obtained. As is known to those of skill in the art,expression in transformed plants may be tissue specific and/or specificto certain developmental stages. Truncated promoter selection andstructural gene selection are other parameters which may be optimized toachieve desired plant expression or inhibition as is known to those ofskill in the art and taught herein.

The following is a non-limiting general overview of molecular biology techniques which may be used in performing the methods of the invention.

Structural Gene

Likewise, by means of the present invention, heterologous nucleotide sequences can be expressed in transformed plants. More particularly, plants can be genetically engineered to express various phenotypes of agronomic interest.

Exemplary genes include but are not limited to: plant disease resistance genes (Martin et al., Science 262: 1432 (1993) (tomato Pto gene for resistance to Pseudomonas syringae pv. tomato encodes a protein kinase)); a Bacillus thuringiensis protein (Geiser et al., Gene 48: 109 (1986)); a lectin (Van Damme et al., Plant Molec. Biol. 24: 25 (1994)); a vitamin-binding protein (such as avidin; see PCT application US93/06487); an enzyme inhibitor (Abe et al., J. Biol. Chem. 262: 16793 (1987)); an insect-specific hormone or pheromone (see, for example, Hammock et al., Nature 344: 458 (1990)); an insect-specific peptide or neuropeptide (Regan, J. Biol. Chem. 269: 9 (1994)); an insect-specific venom (Pang et al., Gene 116: 165 (1992)); an enzyme responsible for a hyperaccumulation of a monoterpene; an enzyme involved in the engineering, including the post-translational engineering, of a biologically active molecule, for example, a glycolytic enzyme or a proteolytic enzyme (see PCT application WO 93/02197); a molecule that stimulates signal transduction (for example, Botella et al., Plant Molec. Biol. 24: 757 (1994)); a hydrophobic moment peptide (PCT application WO 95/16776); a membrane permease (Jaynes et al., Plant Sci. 89: 43 (1993)); a viral-invasive protein or a complex toxin derived therefrom (Beachy et al., Ann. Rev. Phytopathol. 28: 451 (1990)); (Taylor et al., Abstract #497, SEVENTH INT'L SYMPOSIUM ON MOLECULAR PLANT-MICROBE INTERACTIONS (Edinburgh, Scotland, 1994)); a virus-specific antibody (Tavladoraki et al., Nature 366: 469 (1993)); a developmental-arrestive protein produced in nature by a pathogen or a parasite (Lamb et al., Bio/Technology 10: 1436 (1992)); a developmental-arrestive protein produced in nature by a plant (Logemann et al., Bio/Technology 10: 305 (1992)); a gene conferring resistance to a herbicide that inhibits the growing point or meristem, such as an imidazolinone or a sulfonylurea (Lee et al., EMBO J. 7: 1241 (1988)); glyphosate resistance (imparted by mutant 5-enolpyruvyl-3-phosphoshikimate synthase (EPSP) and aroA genes, respectively) (U.S. Pat. No. 4,940,835); a gene conferring resistance to a herbicide that inhibits photosynthesis, such as a triazine (psbA and gs+ genes) and a benzonitrile (nitrilase gene) (Przibilla et al., Plant Cell 3: 169 (1991)); engineered fatty acid metabolism, for example, by transforming a plant with an antisense gene of stearoyl-ACP desaturase to increase stearic acid content of the plant (see Knutzon et al., Proc. Natl. Acad. Sci. USA 89: 2624 (1992)); decreased phytate content (Van Hartingsveldt et al., Gene 127: 87 (1993)); engineered carbohydrate composition, for example, by transforming plants with a gene coding for an enzyme that alters the branching pattern of starch (see Shiroza et al., J. Bacteriol. 170: 810 (1988)); and genes that control cell proliferation and growth of the embryo and/or endosperm, such as cell cycle regulators (Bogre L et al., “Regulation of cell division and the cytoskeleton by mitogen-activated protein kinases in higher plants.” Results Probl Cell Differ 27:95-117 (2000)).

Promoters

The promoters disclosed herein may be used in conjunction with naturally occurring flanking coding or transcribed sequences of the desired heterologous nucleotide sequence or structural gene or with any other coding or transcribed sequence that is critical to structural gene formation and/or function.

It may also be desirable to include some intron sequences in the promoter constructs since the inclusion of intron sequences in the coding region may result in enhanced expression and specificity. Thus, it may be advantageous to join the DNA sequences to be expressed to a promoter sequence that contains the first intron and exon sequences of a polypeptide which is unique to cells/tissues of a plant critical to seed specific structural formation and/or function.

Additionally, regions of one promoter may be joined to regions from a different promoter in order to obtain the desired promoter activity resulting in a chimeric promoter. Synthetic promoters which regulate gene expression may also be used.

The expression system may be further optimized by employing supplemental elements such as transcription terminators and/or enhancer elements.

Other Regulatory Elements

In addition to a promoter sequence, an expression cassette or construct should also contain a transcription termination region downstream of the structural gene to provide for efficient termination. The termination region or polyadenylation signal may be obtained from the same gene as the promoter sequence or may be obtained from a different gene. Polyadenylation sequences include, but are not limited to, the Agrobacterium octopine synthase signal (Gielen et al., EMBO J. (1984) 3:835-846), the nopaline synthase signal (Depicker et al., Mol. and Appl. Genet. (1982) 1:561-573), or that of the proteinase inhibitor II (pin II) gene from potato.

Marker Genes

Recombinant DNA molecules containing any of the DNA sequences and promoters described herein may additionally contain selection marker genes. These encode a gene product that confers on a plant cell resistance to a chemical agent or physiological stress, or confers a distinguishable phenotypic characteristic on the cells, such that plant cells transformed with the recombinant DNA molecule may be easily selected using a selective agent. One such selection marker gene is neomycin phosphotransferase (NPT II), which confers resistance to kanamycin and the antibiotic G-418. Cells transformed with this selection marker gene may be selected for by assaying in vitro for phosphorylation of kanamycin using techniques described in the literature, or by testing for the presence of the mRNA coding for the NPT II gene by Northern blot analysis of RNA from the tissue of the transformed plant. Polymerase chain reactions are also used to identify the presence of a transgene (PCR on genomic DNA) or to monitor its expression (reverse transcriptase PCR amplification). Other commonly used selection markers include the ampicillin resistance gene, the tetracycline resistance gene and the hygromycin resistance gene. Transformed plant cells thus selected can be induced to differentiate into plant structures which will eventually yield whole plants. It is to be understood that a selection marker gene may also be native to a plant.

Transformation

In accordance with the present invention, a transgenic plant is produced that contains a DNA molecule, comprised of elements as described above, integrated into its genome so that the plant expresses a heterologous gene-encoding DNA sequence. In order to create such a transgenic plant, the expression vectors containing the gene can be introduced into protoplasts, into intact tissues, such as immature embryos and meristems, into callus cultures, or into isolated cells. Preferably, expression vectors are introduced into intact tissues. General methods of culturing plant tissues are provided, for example, by Miki et al, “Procedures for Introducing Foreign DNA into Plants” in Methods in Plant Molecular Biology and Biotechnology, Glick et al (eds) pp. 67-68 (CRC Press 1993) and by Phillips et al, “Cell/Tissue Culture and In Vitro Manipulation” in Corn and Corn Improvement 3d Edit. Sprague et al (eds) pp. 345-387 (American Soc. of Agronomy 1988). The selectable marker incorporated in the DNA molecule allows for selection of transformants.

Methods for introducing expression vectors into plant tissue available to one skilled in the art are varied and will depend on the plant selected. Procedures for transforming a wide variety of plant species are well known and described throughout the literature. See, for example, Miki et al, supra; Klein et al, Bio/Technology 10:268 (1992); and Weising et al., Ann. Rev. Genet. 22: 421-477 (1988). For example, the DNA construct may be introduced into the genomic DNA of the plant cell using techniques such as microprojectile-mediated delivery, Klein et al., Nature 327: 70-73 (1987); electroporation, Fromm et al., Proc. Natl. Acad. Sci. 82: 5824 (1985); polyethylene glycol (PEG) precipitation, Paszkowski et al., EMBO J. 3: 2717-2722 (1984); direct gene transfer, WO 85/01856 and EP No. 0 275 069; in vivo protoplast transformation, U.S. Pat. No. 4,684,611; and microinjection of plant cell protoplasts or embryogenic callus, Crossway, Mol. Gen. Genetics 202:179-185 (1985). Co-cultivation of plant tissue with Agrobacterium tumefaciens is another option, where the DNA constructs are placed into a binary vector system. Ishida et al., “High Efficiency Transformation of Maize (Zea mays L.) Mediated by Agrobacterium tumefaciens” Nature Biotechnology 14:745-750 (1996). The virulence functions of the Agrobacterium tumefaciens host will direct the insertion of the construct into the plant cell DNA when the cell is infected by the bacteria. See, for example, Horsch et al., Science 223: 496-498 (1984), and Fraley et al., Proc. Natl. Acad. Sci. 80: 4803 (1983).

Standard methods for transformation of canola are described by Moloney et al. “High Efficiency Transformation of Brassica napus Using Agrobacterium Vectors” Plant Cell Reports 8:238-242 (1989). Corn transformation is described by Fromm et al, Bio/Technology 8:833 (1990) and Gordon-Kamm et al, supra. Agrobacterium is primarily used in dicots, but certain monocots such as maize can be transformed by Agrobacterium. U.S. Pat. No. 5,550,318. Rice transformation is described by Hiei et al., “Efficient Transformation of Rice (Oryza sativa L.) Mediated by Agrobacterium and Sequence Analysis of the Boundaries of the T-DNA” The Plant Journal 6(2): 271-282 (1994), Christou et al, Trends in Biotechnology 10:239 (1992) and Lee et al, Proc. Nat'l Acad. Sci. USA 88:6389 (1991). Wheat can be transformed by techniques similar to those used for transforming corn or rice. Sorghum transformation is described by Casas et al, supra and by Wan et al, Plant Physiology 104:37 (1994). Soybean transformation is described in a number of publications, including U.S. Pat. No. 5,015,580.

In one preferred method, the Agrobacterium transformation methods of Ishida supra, also described in U.S. Pat. No. 5,591,616, are generally followed, with modifications that the inventors have found improve the number of transformants obtained. The Ishida method uses the A188 variety of maize that produces Type I callus in culture. In one preferred embodiment the Hi II maize line is used, which initiates Type II embryogenic callus in culture. While Ishida recommends selection on phosphinothricin when using the bar or PAT gene for selection, another preferred embodiment provides for use of bialaphos instead.

The bacterial strain used in the Ishida protocol is LBA4404 with the 40 kb super binary plasmid containing three vir loci from the hypervirulent A281 strain. The plasmid has resistance to tetracycline. The cloning vector cointegrates with the super binary plasmid. Since the cloning vector has an E. coli specific replication origin, it cannot survive in Agrobacterium without cointegrating with the super binary plasmid. Since the LBA4404 strain is not highly virulent, and has limited application without the super binary plasmid, the inventors have found in yet another embodiment that the EHA101 strain is preferred. It is a disarmed helper strain derived from the hypervirulent A281 strain. The cointegrated super binary/cloning vector from the LBA4404 parent is isolated and electroporated into EHA101, selecting for spectinomycin resistance. The plasmid is isolated to assure that the EHA101 contains the plasmid.

Further, the Ishida protocol as described provides for growing a fresh culture of the Agrobacterium on plates, scraping the bacteria from the plates, and resuspending in the co-culture medium as stated in the '616 patent for incubation with the maize embryos. This medium includes 4.3 g MS salts, 0.5 mg nicotinic acid, 0.5 mg pyridoxine hydrochloride, 1.0 mg thiamine hydrochloride, casamino acids, 1.5 mg 2,4-D, 68.5 g sucrose and 36 g glucose, all at a pH of 5.8. In a further preferred method, the bacteria are grown overnight in a 1 ml culture, then a fresh 10 ml culture is re-inoculated the next day when transformation is to occur. The bacteria grow into log phase, and are harvested at a density of no more than OD600=0.5, preferably between 0.2 and 0.5. The bacteria are then centrifuged to remove the media and resuspended in the co-culture medium. Since Hi II is used, medium preferred for Hi II is used. This medium is described in considerable detail by Armstrong, C. L. and Green, C. E., “Establishment and maintenance of friable, embryogenic maize callus and involvement of L-proline” Planta (1985) 164:207-214. The resuspension medium is the same as that described above. All further Hi II media are as described in Armstrong et al. The result is redifferentiation of the plant cells and regeneration into a plant. Redifferentiation is sometimes referred to as dedifferentiation, but the former term more accurately describes the process where the cell begins with a form and identity, is placed on a medium in which it loses that identity, and becomes “reprogrammed” to have a new identity. Thus the scutellum cells become embryogenic callus.

It is often desirable to have the DNA sequence in the homozygous state, which may require more than one transformation event to create a parental line, requiring transformation with a first and second recombinant DNA molecule both of which encode the same gene product. It is further contemplated in some of the embodiments of the process of the invention that a plant cell be transformed with a recombinant DNA molecule containing at least two DNA sequences or be transformed with more than one recombinant DNA molecule. The DNA sequences or recombinant DNA molecules in such embodiments may be physically linked, by being in the same vector, or physically separate on different vectors. A cell may be simultaneously transformed with more than one vector provided that each vector has a unique selection marker gene. Alternatively, a cell may be transformed with more than one vector sequentially, allowing an intermediate regeneration step after transformation with the first vector. Further, it may be possible to perform a sexual cross between individual plants or plant lines containing different DNA sequences or recombinant DNA molecules, and then to select from the progeny of the cross plants containing both DNA sequences or recombinant DNA molecules. Preferably, the DNA sequences or the recombinant molecules are linked or located on the same chromosome.

Expression of recombinant DNA molecules containing the DNA sequences and promoters described herein in transformed plant cells may be monitored using Northern blot techniques and/or Southern blot techniques or PCR-based methods known to those of skill in the art.

A large number of plants have been shown capable of regeneration from transformed individual cells to obtain transgenic whole plants. Corn has long been a successful plant transformation recipient (Fromm et al., Bio/Technology 8:833 (1990)). Regeneration has also been shown for the following dicots: apple, Malus pumila (James et al., Plant Cell Reports (1989) 7:658); blackberry, Rubus, blackberry/raspberry hybrid, Rubus, and red raspberry, Rubus (Graham et al., Plant Cell, Tissue and Organ Culture (1990) 20:35); carrot, Daucus carota (Thomas et al., Plant Cell Reports (1989) 8:354; Wurtele and Bulka, Plant Science (1989) 61:253); cauliflower, Brassica oleracea (Srivastava et al., Plant Cell Reports (1988) 7:504); celery, Apium graveolens (Catlin et al., Plant Cell Reports (1988) 7:100); cucumber, Cucumis sativus (Trulson et al., Theor. Appl. Genet. (1986) 73:11); eggplant, Solanum melongena (Guri and Sink, J. Plant Physiol. (1988) 133:52); lettuce, Lactuca sativa (Michelmore et al., Plant Cell Reports (1987) 6:439); potato, Solanum tuberosum (Sheerman and Bevan, Plant Cell Reports (1988) 7:13); rape, Brassica napus (Radke et al., Theor. Appl. Genet. (1988) 75:685; Moloney et al., Plant Cell Reports (1989) 8:238); soybean (wild), Glycine canescens (Rech et al., Plant Cell Reports (1989) 8:33); strawberry, Fragaria × ananassa (Nehra et al., Plant Cell Reports (1990) 9:10); tomato, Lycopersicon esculentum (McCormick et al., Plant Cell Reports (1986) 5:81); walnut, Juglans regia (McGranahan et al., Plant Cell Reports (1990) 8:512); melon, Cucumis melo (Fang et al., 86th Annual Meeting of the American Society for Horticultural Science, Hort. Science (1989) 24:89); grape, Vitis vinifera (Colby et al., Symposium on Plant Gene Transfer, UCLA Symposia on Molecular and Cellular Biology, J Cell Biochem Suppl (1989) 13D:255); mango, Mangifera indica (Mathews et al., Symposium on Plant Gene Transfer, UCLA Symposia on Molecular and Cellular Biology, J Cell Biochem Suppl (1989) 13D:264); and for the following monocots: rice, Oryza sativa (Shimamoto et al., Nature (1989) 338:274); rye, Secale cereale (de la Pena et al., Nature (1987) 325:274); maize (Rhodes et al., Science (1988) 240:204).

In addition, regeneration of whole plants from cells (not necessarily transformed) has been observed in apricot, Prunus armeniaca (Pieterse, Plant Cell Tissue and Organ Culture (1989) 19:175); asparagus, Asparagus officinalis (Elmer et al., J. Amer. Soc. Hort. Sci. (1989) 114:1019); banana, hybrid Musa (Escalant and Teisson, Plant Cell Reports (1989) 7:665); bean, Phaseolus vulgaris (McClean and Grafton, Plant Science (1989) 60:117); cherry, hybrid Prunus (Ochatt et al., Plant Cell Reports (1988) 7:393); grape, Vitis vinifera (Matsuta and Hirabayashi, Plant Cell Reports (1989) 7:684); mango, Mangifera indica (DeWald et al., J Amer Soc Hort Sci (1989) 114:712); melon, Cucumis melo (Moreno et al., Plant Sci Letters (1985) 34:195); okra, Abelmoschus esculentus (Roy and Mangat, Plant Science (1989) 60:77; Dirks and van Buggenum, Plant Cell Reports (1989) 7:626); onion, hybrid Allium (Lu et al., Plant Cell Reports (1989) 7:696); orange, Citrus sinensis (Hidaka and Kajikura, Scientia Horticulturae (1988) 34:85); papaya, Carica papaya (Litz and Conover, Plant Sci Letters (1982) 26:153); peach, Prunus persica and plum, Prunus domestica (Mante et al., Plant Cell Tissue and Organ Culture (1989) 19:1); pear, Pyrus communis (Chevreau et al., Plant Cell Reports (1988) 7:688; Ochatt and Power, Plant Cell Reports (1989) 7:587); pineapple, Ananas comosus (DeWald et al., Plant Cell Reports (1988) 7:535); watermelon, Citrullus vulgaris (Srivastava et al., Plant Cell Reports (1989) 8:300); wheat, Triticum aestivum (Redway et al., Plant Cell Reports (1990) 8:714).

The regenerated plants are transferred to standard soil conditions and cultivated in a conventional manner. After the expression or inhibition cassette is stably incorporated into regenerated transgenic plants, it can be transferred to other plants by sexual crossing. Any of a number of standard breeding techniques can be used, depending upon the species to be crossed.

It may be useful to generate a number of individual transformed plants with any recombinant construct in order to recover plants free from any position effects. It may also be preferable to select plants that contain more than one copy of the introduced recombinant DNA molecule such that high levels of expression of the recombinant molecule are obtained.

According to a preferred embodiment, the transgenic plant provided for commercial production of foreign protein is maize. In another preferred embodiment, the biomass of interest is seed. For the relatively small number of transgenic plants that show higher levels of expression, a genetic map can be generated, primarily via conventional Restriction Fragment Length Polymorphisms (RFLP), Polymerase Chain Reaction (PCR) analysis, and Simple Sequence Repeats (SSR), which identifies the approximate chromosomal location of the integrated DNA molecule. For exemplary methodologies in this regard, see Glick and Thompson, METHODS IN PLANT MOLECULAR BIOLOGY AND BIOTECHNOLOGY 269-284 (CRC Press, Boca Raton, 1993). Map information concerning chromosomal location is useful for proprietary protection of a subject transgenic plant. If unauthorized propagation is undertaken and crosses made with other germplasm, the map of the integration region can be compared to similar maps for suspect plants, to determine if the latter have a common parentage with the subject plant. Map comparisons would involve hybridizations, RFLP, PCR, SSR and sequencing, all of which are conventional techniques.

As indicated above, it may be desirable to produce plant lines which are homozygous for a particular gene. In some species this is accomplished rather easily by the use of anther culture or isolated microspore culture. This is especially true for the oil seed crop Brassica napus (Keller and Armstrong, Z. Pflanzenzucht 80:100-108, 1978). By using these techniques, it is possible to produce a haploid line that carries the inserted gene and then to double the chromosome number either spontaneously or by the use of colchicine. This gives rise to a plant that is homozygous for the inserted gene, which can be easily assayed for if the inserted gene carries with it a suitable selection marker gene for detection of plants carrying that gene. Alternatively, plants may be self-fertilized, leading to the production of a mixture of seed that consists of, in the simplest case, three types: homozygous (25%), heterozygous (50%) and null (25%) for the inserted gene. Although it is relatively easy to score null plants from those that contain the gene, it is possible in practice to score the homozygous from heterozygous plants by Southern blot analysis in which careful attention is paid to the loading of exactly equivalent amounts of DNA from the mixed population, and scoring heterozygotes by the intensity of the signal from a probe specific for the inserted gene. It is advisable to verify the results of the Southern blot analysis by allowing each independent transformant to self-fertilize, since additional evidence for homozygosity can be obtained by the simple fact that if the plant was homozygous for the inserted gene, all of the subsequent plants from the selfed seed will contain the gene, while if the plant was heterozygous for the gene, the generation grown from the selfed seed will contain null plants. Therefore, with simple selfing one can easily select homozygous plant lines that can also be confirmed by Southern blot analysis.
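The segregation figures and the selfing test described above can be illustrated with a short calculation. The following is only a sketch under the simplest-case assumption of a single-locus insert; the function names and the progeny count of 16 are hypothetical and are not taken from the specification.

```python
from fractions import Fraction
from itertools import product

def selfed_progeny_ratios():
    """Genotype frequencies when a plant carrying one copy of a
    single-locus insert (T/-) is self-fertilized."""
    gametes = ["T", "-"]  # each gamete either carries the insert (T) or not (-)
    counts = {"homozygous": 0, "heterozygous": 0, "null": 0}
    labels = {2: "homozygous", 1: "heterozygous", 0: "null"}
    for egg, pollen in product(gametes, repeat=2):
        copies = (egg == "T") + (pollen == "T")
        counts[labels[copies]] += 1
    total = sum(counts.values())
    return {name: Fraction(n, total) for name, n in counts.items()}

def prob_no_nulls(n_progeny):
    """Chance that a heterozygous parent shows no null plants among
    n selfed progeny, i.e. is mistaken for a homozygote."""
    return Fraction(3, 4) ** n_progeny

ratios = selfed_progeny_ratios()
print(ratios)                      # 1/4 homozygous, 1/2 heterozygous, 1/4 null
print(float(prob_no_nulls(16)))    # about 0.01
```

Under these assumptions, scoring 16 selfed progeny leaves roughly a 1% chance that a heterozygous parent produces no null plants, which is why a larger progeny sample gives stronger evidence of homozygosity.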

Creation of homozygous parental lines makes possible the production of hybrid plants and seeds which will contain a modified protein component. Transgenic homozygous parental lines are maintained with each parent containing either the first or second recombinant DNA sequence operably linked to a promoter. Also incorporated in this scheme are the advantages of growing a hybrid crop, including the combining of more valuable traits and hybrid vigor.

The following examples serve to better illustrate the invention described herein and are not intended to limit the invention in any way. All references cited herein are hereby expressly incorporated to this document in their entirety by reference.

EXAMPLES

Methods

Construction of Ubi-1 Promoter Variants

The DNA construct PHP8904 (Pioneer Hi-Bred; Johnston, Iowa) contains the GUS reporter gene positioned 3′ to approximately 0.9 kb of 5′ flanking sequence of maize Ubi-1, plus the Ubi-1 5′ untranslated leader sequence and first intron. The potato proteinase inhibitor II transcription terminator region is present 3′ of GUS. PHP8904 also carries right and left border sequences of an Agrobacterium tumefaciens Ti plasmid, bacterial antibiotic resistance and origin of replication sequences, and the bar gene of Streptomyces hygroscopicus, conferring resistance to the herbicide bialaphos. The construct PGN7062 is essentially identical to PHP8904, except that the GUS reporter gene includes sequences encoding six C-terminal histidine residues. All subsequent constructs are similar to PGN7062 but have modifications in the Ubi-1 5′ flanking sequences (Table 1). For each Ubi-1 5′ flanking sequence variant, a series of oligonucleotides was generated that together span the putative heat shock elements (HSEs). These oligonucleotides were assembled and the sequences amplified by the polymerase chain reaction. The DNA fragments were introduced into the cloning vector pCR2.1 (Invitrogen; Carlsbad, Calif.). SalI-BglII restriction enzyme generated DNA fragments spanning the engineered HSEs were isolated from the pCR2.1 based plasmids and were transferred into an intermediate pGEM (Promega Corporation; Madison, Wis.) based plasmid, PGN5796, so replacing the corresponding wild type Ubi-1 5′ flanking sequence. HindIII-NheI restriction enzyme generated DNA fragments, spanning the entire Ubi-1 5′ flanking sequence and 5′ untranslated region plus part of the first intron, were then transferred into PGN7062, so replacing the corresponding wild type Ubi-1 sequence.

TABLE 1
Engineered Ubi-1 promoter

DNA construct   DNA sequence¹                                    HSE description               Transgenic lines
PGN7062         CTGGACCCCTCTCGAGAGTTCCGCT (SEQ ID NO: 1)         wild type                     GSB
PGN7547         ---------------------------                      HSEs deleted                  GSC
PGN7565         CTGGACCCCTCTCGA---------- (SEQ ID NO: 2)         3′ HSE deleted                GSD
PGN7583         ----------CTCGAGAGTTCCGCT (SEQ ID NO: 3)         5′ HSE deleted                GSE
PGN7600         CTGGACCCCTCTCGACTCGAGAGTTCCGCT (SEQ ID NO: 4)    HSEs adjacent                 GSF
PGN8926         3x(GACACGTAGAATGACTCATCAC) (SEQ ID NO: 5)        HSEs replaced by Ps1 trimer   GSG

¹The 5′ HSE is in bold type and the 3′ HSE is underlined.

Transient Transformation

Transient transformations using Agrobacterium tumefaciens were performed using sonication-assisted Agrobacterium transformation as described by Trick and Finer (Trick, H. N. et al. (1997) “SAAT: Sonication assisted Agrobacterium-mediated transformation”, Transgenic Res. 6:329-336). Ten immature zygotic embryos per tube were sonicated in the presence of Agrobacterium tumefaciens EHA101 (pSB111) at an OD_(600 nm) of 0.5 for 30 s, were placed onto co-cultivation medium and were incubated for 5 days. Embryos were stained for 24 hours with 5 mg ml⁻¹ X-gluc (5-bromo-4-chloro-3-indolyl-β-D-glucuronic acid: cyclohexyl ammonium salt) (Inalco; Milan, Italy) dissolved in Jefferson's buffer (Jefferson, R. A. (1987), “Assaying chimeric genes in plants: the GUS gene fusion system”, Plant Molec. Biol. Reporter 5:387-405). They were subsequently transferred to 70% ethanol.

Transformation, Tissue Culture and Plant Growth

The procedure for stable transformation followed a modified version of Ishida et al. (Ishida, Y. et al. (1996), “High efficiency transformation of maize (Zea mays L.) mediated by Agrobacterium tumefaciens”, Nature Biotech 14:745-750) and Armstrong and Green (Armstrong, C. L., et al. (1985) “Establishment and maintenance of friable, embryogenic maize callus and the involvement of L-proline”, Planta 164:207-214). Transformation and regeneration media are described in Table 2. Immature zygotic embryos were isolated from Hi-II maize kernels at 12 days after pollination and were transformed with Agrobacterium tumefaciens strain EHA101 containing the engineered Ubi-1 variant constructs. For Agrobacterium infection, bacteria were grown overnight in YEP liquid medium supplemented with antibiotic. Agrobacterium were then re-inoculated into YEP supplemented with 100 mg l⁻¹ kanamycin and 100 mg l⁻¹ spectinomycin and were grown to an OD_(550 nm) of 0.4-0.6. The Agrobacterium culture was centrifuged to remove the media and the pellet was resuspended in inoculation medium. Immature zygotic embryos were washed with inoculation medium and immersed in the Agrobacterium solution, vortexed for 30 s, incubated for 5 minutes, and plated and co-cultivated for 4 days on solid co-culture medium. Embryos were transferred for three days onto non-selective medium supplemented with 100 mg l⁻¹ carbenicillin, and then subcultured to bialaphos selection medium and subsequently subcultured every two weeks. Embryogenic tissue (events) proliferating on selection media was excised and cultured on the same medium for proliferation for four weeks and was then subcultured onto regeneration medium for three weeks to allow embryo formation. Embryos were picked and transferred to germination medium for one to two weeks with light at 28° C. Plants that regenerated were transferred to tubes for root and shoot elongation.

Multiple T0 plants were regenerated from embryogenic tissues that were selected on bialaphos and these were transferred to a greenhouse. T0 plants were crossed with elite inbred lines to produce T1 seeds. For analysis of T1 leaves, T1 seeds were germinated in a greenhouse and were leaf painted with Finale® at 1% active ingredient for selection of transformed plants. Leaf samples were collected three weeks after germination.

Preparation of Plant Extracts

For seed extracts, individual dry seeds were pulverized using a hammer and extracted with 500 μl of lysis buffer (50 mM sodium phosphate pH 7.0, 1 mM EDTA, 10 mM β-mercaptoethanol). Samples were placed in extraction tubes, each with a ball bearing added, in a Beckman rack and were homogenized in a high speed shaker for 20 s. Samples were centrifuged, and the supernatants recovered and stored on ice prior to analysis. For leaf extracts, small portions of the ends of leaves were cut off and pulverized under liquid nitrogen. Weights were recorded and extractions completed using lysis buffer at 10 μl per mg of sample. For callus extracts, samples were extracted using lysis buffer at 1 μl per mg of callus tissue. Protein concentrations were determined according to Bradford (1976).

β-glucuronidase Activity Assay

GUS assays were performed as described by Jefferson (1987, supra). Total soluble protein (1 μg) was incubated in 100 μl of lysis buffer and the reaction initiated with 25 μl of 5 mM 4-methylumbelliferyl β-D-glucuronide (MUG, Sigma M-9130). The reaction was incubated for up to 20 minutes at 37° C. At specific time points 25 μl volumes of the reaction mixture were transferred to a Dynatech Microfluor reading plate that had 175 μl of stop buffer (0.2 M Na₂CO₃) per well. Fluorescence was measured at an excitation wavelength of 360 nm and an emission wavelength of 460 nm on a Microplate Fluorometer (Cambridge Technologies 7625). GUS protein levels were then calculated by comparison to a standard curve of 1-100 μM 4-methylumbelliferone (MU, Sigma M-1508).
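The conversion from fluorescence readings to MU via a standard curve can be sketched as follows. This is an illustrative calculation only: the helper names, the standard readings and the time course are invented for the example, and a linear fluorescence response over the standard range is assumed rather than stated in the specification.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fluorescence_to_mu(reading, slope, intercept):
    """Invert the standard curve: fluorescence reading -> MU concentration."""
    return (reading - intercept) / slope

# Hypothetical MU standards (same concentration units as the standard curve)
# and their fluorescence readings in arbitrary units. Invented values.
std_conc = [1.0, 10.0, 25.0, 50.0, 100.0]
std_fluor = [12.0, 102.0, 252.0, 502.0, 1002.0]
slope, intercept = fit_line(std_conc, std_fluor)

# Hypothetical time course for one sample: minutes and stopped-aliquot readings.
times = [0.0, 5.0, 10.0, 20.0]
readings = [2.0, 52.0, 102.0, 202.0]
mu = [fluorescence_to_mu(r, slope, intercept) for r in readings]

# The slope of MU against time estimates the rate of MU release per minute.
rate, _ = fit_line(times, mu)
print(round(slope, 3), round(intercept, 3), round(rate, 3))
```

Fitting the rate over several time points, rather than using a single end point, makes it easier to spot a reaction that has left its linear range.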

Results

Conserved Ubi-1 Promoter Sequences are not Required for Transient Expression in Maize Embryos

To investigate whether engineered versions of the maize Ubi-1 promoter would facilitate high levels of constitutive expression, we generated a series of fusions of native or engineered Ubi-1 sequences to the GUS reporter gene for introduction into plants. The putative HSEs of the Ubi-1 promoter were removed, their relative spacing was altered or they were substituted with a trimer of a seed specific element from the promoter of the pea lectin gene Ps1 (Table 1).

The promoter variants were first assessed in a transient transformation system. The DNA constructs were introduced into zygotic embryos of maize and GUS activity was detected qualitatively by histochemical staining.
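The relationship between the HSE sequences listed in Table 1 can be checked with a few lines of code. This is a sketch for illustration: the assignment of the 5′ and 3′ elements is inferred from the two single-deletion rows of Table 1, and the variable and function names are hypothetical.

```python
WILD_TYPE = "CTGGACCCCTCTCGAGAGTTCCGCT"   # PGN7062, SEQ ID NO: 1
FIVE_PRIME_HSE = "CTGGACCCCTCTCGA"        # retained in PGN7565 (3' HSE deleted)
THREE_PRIME_HSE = "CTCGAGAGTTCCGCT"       # retained in PGN7583 (5' HSE deleted)
PS1_ELEMENT = "GACACGTAGAATGACTCATCAC"    # repeated three times in PGN8926

def overlap_length(left, right):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return k
    return 0

# In the wild type promoter the two elements share bases.
shared = overlap_length(FIVE_PRIME_HSE, THREE_PRIME_HSE)
merged = FIVE_PRIME_HSE + THREE_PRIME_HSE[shared:]

# Placing the elements end to end removes the overlap (PGN7600).
adjacent = FIVE_PRIME_HSE + THREE_PRIME_HSE

# Replacement of both elements by a trimer of the Ps1 element (PGN8926).
ps1_trimer = PS1_ELEMENT * 3

print(shared, merged == WILD_TYPE, adjacent, len(ps1_trimer))
```

Under this reading, the two elements overlap by five bases (CTCGA) in the wild type sequence, and joining them without the overlap reproduces the 30-base sequence listed for PGN7600.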

TABLE 2

Transient transformants   Mean GUS expression score (relative value)*
GSB                       2.1
GSC                       2.0
GSD                       1.7
GSE                       2.5
GSF                       2.1
GSG                       2.2
promoterless GUS          0.0
no vector                 0.0

*Score system: 3 = high; 2 = medium; 1 = low; 0 = nothing

In all cases, GUS is synthesized, indicating that none of the modifications to the Ubi-1 promoter knocks out expression. However, embryos transiently transformed with PGN7565 (GSD) scored somewhat lower than those transformed with the other constructs (Table 2).

Engineering of Conserved Ubi-1 Promoter Sequences can Increase Expression in Stably Transformed Lines of Maize

To more accurately assess the engineered Ubi-1 promoter variants, stably transformed lines were developed. The series of GUS fusions was introduced into zygotic embryos of maize to generate stable transformation events. Multiple seedlings were regenerated from embryogenic callus tissue of each event to give transformed lines, and seedlings matured and flowered to generate T1 seeds. GUS activity was determined in embryogenic callus tissue, leaves of seedlings regenerated from tissue culture and T1 seeds. The native Ubi-1 5′ flanking sequence and the promoter variants of Ubi-1 all drive GUS expression in each tissue type, but levels of GUS are much lower in embryogenic callus and in leaf tissue than in seeds.

Among plants derived from any specific transformation event, considerable variation in the level of GUS expression exists in leaves of regenerated seedlings and in T1 seeds. In addition, GUS expression in embryogenic callus tissue, leaves of regenerated seedlings and T1 seeds varies between different transformation events.

However, focusing on T1 seed, which is the preferred site of expression for the commercial production of foreign proteins in corn, there are significant differences in mean levels of expression between transformed lines carrying the engineered promoters. GSD and GSG lines have levels of GUS expression similar to the control GSB line, but surprisingly, GSC, GSE and GSF lines have elevated expression levels (FIG. 1C). A ranking of GUS expression levels in T1 seed between lines transformed with the promoter variants is similar whether mean or highest recorded expression levels are considered (FIG. 2).

Ubi-1 Promoter Variants Drive Constitutive Expression but Have Tissue Preferences in the Kernel

Maize Ubi-1 is constitutively expressed and the Ubi-1 promoter can drive the constitutive expression of reporter genes in transgenic plants (Christensen, A. H., et al. (1992), “Maize polyubiquitin genes: structure, thermal perturbation of expression and transcript splicing, and promoter activity following transfer to protoplasts by electroporation”, Plant Mol. Biol. 18:675-689; Takimoto, I., et al. (1994), “Non-systemic expression of a stress-responsive maize polyubiquitin gene (Ubi-1) in transgenic rice plants”, Plant Mol. Biol. 26:1007-1012; Christensen, A. H., et al. (1996), “Ubiquitin promoter-based vectors for high-level expression of selectable and/or screenable marker genes in monocotyledonous plants”, Transgenic Res. 5:213-218). The engineered Ubi-1 promoter variants generated here drive expression of GUS in embryogenic callus tissue and leaves of seedlings regenerated from tissue culture. To examine whether the promoter variants cause constitutive expression in plants germinated from seed, a selection of transformed lines that express GUS at a high level in T1 seeds was analyzed. GUS activity was determined in leaf tissue of developing T1 seedlings and was compared to the activity that had been recorded for T1 seeds (FIG. 3). GUS was detected in leaves of transformed lines carrying all engineered Ubi-1 promoter variants, but expression was much lower than in seeds. Due to small selected sample sizes, there is considerable variation in the expression data among lines carrying each engineered promoter variant. However, the ranking of GUS expression in leaf tissue among the variants reflects the ranking in seed, except that in GSC lines expression ranks higher in seed than in leaf tissue.

The activity of the Ubi-1 promoter variants was also assessed visually in various tissues. Selected T1 kernels were either directly analyzed by cutting into sections and staining for GUS activity, or were germinated to generate seedlings of which root and leaf tissues were analyzed. GUS activity is observed in leaves and roots with all transformed lines, and in both tissue types is highest for GSE (5′ HSE deleted) and GSF (HSEs adjacent) lines. In kernels of GSB lines GUS expression is higher in the embryo than in the endosperm. In transformed lines carrying the Ubi-1 promoter variants the distribution of GUS activity is more uniform across the seed, indicating increased expression in the endosperm compared to lines transformed with the native promoter sequence.

TABLE 3
Root and leaf qualitative data

Construct   Mean root   Mean leaf
GSB         2.0         2.0
GSC         2.3         2.7
GSD         3.0         2.8
GSE         3.7         3.6
GSF         4.0         3.7
GSG         3.0         2.0

*Score on a scale of 0 to 4 (−, +/−, +, ++, +++)

Tissue specific expression within the seed was further investigated by dissecting apart embryos and endosperm, and then determining expression levels separately. GSB (wild type) lines have a strong tissue type bias in the expression of GUS, with over 90% of the total activity in the embryo. GSD (3′ HSE deleted), GSE (5′ HSE deleted) and GSF lines show a lesser degree of embryo preferred expression, GSD (3′ HSE deleted) lines have a similar level of GUS in each tissue, and GSG (HSE's replaced by Ps1 trimer) lines have much more GUS in the endosperm. In fact, with GSG (HSE's replaced by Ps1 trimer) lines the activity of the engineered Ubi-1 promoter is similar in the embryo and endosperm, but since the endosperm is about 7.5-fold larger than the embryo, most of the GUS is in the endosperm.

TABLE 4

Transformants   Seed fraction   Proportion of GUS
GSB             embryo          0.92
                endosperm       0.08
GSC             embryo          0.89
                endosperm       0.11
GSD             embryo          0.47
                endosperm       0.53
GSE             embryo          0.83
                endosperm       0.17
GSF             embryo          0.21
                endosperm       0.79
GSG             embryo          0.15
                endosperm       0.85

As can be seen, GSC, GSD, GSE, GSF and GSG all had altered ratios of embryo/endosperm expression relative to GSB. GSD had an almost 50/50 split, and GSG had the ratio reversed, with endosperm expression preferred.
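The arithmetic behind these ratios can be sketched as follows. This is only an illustrative calculation, not part of the disclosed method: the proportions are copied from TABLE 4, the 7.5-fold figure is taken from the text and is assumed here to be a simple endosperm-to-embryo size ratio, and the function names are hypothetical.

```python
# Proportion of total seed GUS activity in (embryo, endosperm), copied from TABLE 4.
TABLE_4 = {
    "GSB": (0.92, 0.08),
    "GSC": (0.89, 0.11),
    "GSD": (0.47, 0.53),
    "GSE": (0.83, 0.17),
    "GSF": (0.21, 0.79),
    "GSG": (0.15, 0.85),
}

# Assumption: the endosperm is about 7.5-fold larger than the embryo (figure from the text).
ENDOSPERM_TO_EMBRYO_SIZE = 7.5


def total_ratio(embryo, endosperm):
    """Embryo:endosperm ratio of total GUS activity in the seed."""
    return embryo / endosperm


def per_size_ratio(embryo, endosperm, size_ratio=ENDOSPERM_TO_EMBRYO_SIZE):
    """Embryo:endosperm ratio of GUS activity per unit of tissue size.

    The embryo is taken as 1 size unit and the endosperm as `size_ratio` units.
    """
    return embryo / (endosperm / size_ratio)


def summarize(table=TABLE_4):
    """Return the ratios and the fraction holding more GUS for each construct."""
    summary = {}
    for construct, (embryo, endosperm) in table.items():
        summary[construct] = {
            "total_ratio": total_ratio(embryo, endosperm),
            "per_size_ratio": per_size_ratio(embryo, endosperm),
            "preferred": "embryo" if embryo > endosperm else "endosperm",
        }
    return summary


if __name__ == "__main__":
    for construct, row in summarize().items():
        print(
            f"{construct}: total {row['total_ratio']:.2f}, "
            f"per size {row['per_size_ratio']:.2f}, "
            f"more GUS in {row['preferred']}"
        )
```

Under the stated size assumption, a per-size ratio near 1 corresponds to similar promoter activity in the two tissues even when most of the total GUS is found in the larger endosperm.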

Discussion

Several maize Ubi-1 promoter sequences with engineering to the putative HSEs were used to drive GUS expression in transgenic corn seed. Surprisingly, deletion or engineering of the elements does not significantly reduce expression of a reporter gene. Rather, with some Ubi-1 promoter variants, expression of GUS is increased. Deletion of both putative HSEs or of the 5′ element alone significantly increases expression, as does placing the elements adjacent so that they no longer overlap. Thus, engineering to the 5′ putative HSE increases the level of expression in seed. In the case of re-positioning the elements to remove overlap, the effect may be to inadvertently diminish the activity of the 5′ putative HSE by altering immediately adjacent sequence. Since removal or engineering of the 5′ element appears to increase expression of a reporter gene in seed, the element may restrict expression under standard growth conditions in the context of the native Ubi-1 promoter. Surprisingly, replacement of the putative HSEs with a trimer of a 22 base pair sequence from the promoter of the pea lectin gene, Ps1, does not lead to increased expression. Although the Ps1 derived element does not include an HSE consensus sequence, it does include a five out of seven base pair match to the sequence GACCCCT within the 5′ putative HSE of the Ubi-1 promoter, and this sequence may substitute for the 5′ element.

The wild type Ubi-1 sequence analyzed here drives constitutive expression of GUS. Expression is observed in leaf and root tissue and is particularly high in seed tissue. Within the kernel, expression is seen in both embryo and endosperm tissues, but is preferred in the embryo. This seed and specifically embryo-preferred expression is in agreement with previous work using Ubi-1 promoter sequences in stably transformed lines (Hood, E. E., et al. (1997), “Commercial production of avidin from transgenic maize: characterization of transformant, production, processing, extraction and purification”, Mol. Breed. 3:291-306; Witcher et al., 1998, supra; Zhong et al., 1999, supra) and in embryos transiently transformed by microparticle bombardment. Like the wild type sequence, all of the Ubi-1 promoter variants examined here cause constitutive expression, with GUS being synthesized in leaf, root and especially seed tissue. However, within the kernel, there are notable differences in the balance of expression between embryo and endosperm tissue. None of the Ubi-1 promoter variants is as strongly embryo biased as the wild type sequence, indicating that within the kernel, the putative HSEs favor expression in embryo tissue.

Replacing the putative HSEs with a trimer of the Ps1 promoter element results in similar promoter activity in embryo and endosperm tissue and thus, because of the relative tissue mass, in a greater accumulation of transgene product in the endosperm. When fused to a minimal promoter, the Ps1 trimer confers seed-preferred expression in tobacco (dePater, S., et al. (1994), “A 22-bp fragment of the pea lectin promoter containing essential TGAC-like motifs confers seed-specific gene expression”, Plant Cell 5:877-886; dePater, S., et al. (1996), “The 22 bp W1 element in the pea lectin promoter is necessary and, as a multimer, sufficient for high gene expression in tobacco seeds”, Plant Mol. Biol. 32:515-523), and the basic domain/leucine zipper proteins TGA1a and Opaque-2 can bind this sequence in vitro (dePater, S., et al. (1994), “bZIP proteins bind to a palindromic sequence without an ACGT core located in a seed-specific element of the pea lectin promoter”, Plant J. 6:133-140). Opaque-2 is a well characterized transcription factor of maize endosperm, and may be binding to the Ps1 trimer introduced into the Ubi-1 promoter, so facilitating expression in the endosperm. Since the overall level of transgene expression in the seed is similar in lines transformed with native Ubi-1 sequences and in lines transformed with a promoter in which a Ps1 trimer replaces the HSEs, the Ps1 trimer must act to reduce expression in the embryo, as well as to increase expression in the endosperm.

1. An engineered isolated ubiquitin promoter sequence capable of directing expression of a nucleotide sequence in a plant cell, said engineered isolated ubiquitin promoter sequence comprising: a heat shock region, wherein said heat shock region has the sequence as set forth in SEQ ID NO: 4.

2. A method for causing expression of a heterologous structural gene or open reading frame in a plant cell, said method comprising: introducing to a plant cell an expression construct comprising an engineered isolated ubiquitin promoter sequence operably linked to said heterologous structural gene or open reading frame, wherein said engineered isolated ubiquitin promoter sequence comprises a heat shock region, wherein said heat shock region has the sequence as set forth in SEQ ID NO: 4.

3. An isolated modified ubiquitin promoter sequence comprising: bases 1-1990 of SEQ ID NO:15 which is capable of directing expression of an operably linked sequence in a plant cell, said isolated sequence having been modified so as to delete SEQ ID NO:1.